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Abstract 

The appropriate estimation of incurred but not reported (IBNR.) reserves is traditionally one of 
the most important task of actuaries working in casualty and property insurance. As certain claims 
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are reported many years after their occurrence, the amount and appropriateness of the reserves has 
a strong effect on the results of the institution. In recent years, stochastic reserving methods had 
become increasingly widespread. The topic has a wide actuarial literature, de scrib ing devel op ment 


mode ls and eva luation techniques. We on ly mention the summarizing article lEngland and Ve rral] 


2002 ] and book 


Wiithrich and Merz 


The cardinal aim of our present work is the comparison of appropriateness of several stochastic 
estimation methods, supposing different distributional development models. We view stochastic re¬ 
serving as a stochastic foreca st, so using the com parison techniques developed for stochastic forecasts, 
namely scores, is natural, see Gneiting and Raftery 2007], for instance. We expect that in some cases 
this attitude will be more informative than the classical mean square error of prediction measure. 

Keywords: run-off triangles, stochastic claims reserving, scores, Monte Carlo simulation 


1 Introduction 


The appropriate estimation of outstanding claims (incurred but not reported (IBNR) claims and reported 
but not settled (RBNS) claims) is crucial to preserve the solvency of insurance institutions, especially 
in casualty and property insurance. A general assumption is that claims related to policyholders, which 
occur in a given year, are reported to the insurance company in the subsequent years, sometimes many 
years later. Often, the payout is delayed as well. Thus, reserves have to be made to cover these arising 
obligations. 

Data are usually represented by so called run-off triangles. These are n x m matrices, where element 
X t j (*, j = 1,2,...) represents the claim amount incurred in year i. and paid with a delay of j — 1 years, 
or may also indicate aggregate values. In our research work we suppose that n = m, i.e., restrict data 
to be squared in that sense, for simplicity reasons. Although, thereby generality will not be violated, 
methods can be similarly used for arbitrary n and m positive integers. 

Elements for indices i + j > n + 1 are unknown, have to be predicted. If Xjj denotes an incremental 

value in the triangle, then the outstanding claims are A,,) — ^ ^i,j- We focus on the so 
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called ultimate claim value ( UC ), which is the sum of observed payments (upper triangle) and outstanding 
claims (lower triangle). Remark that throughout the entire article, Xij will denote incremental, and Ci t j 
will denote aggregate claim values, i.e., Cij = Xj -\ + ... + X. L j . 

The topic ha s a wid e actu arial lite rature , describing development models and evaluation techniques. 


See 


Thy lor 2000 1; 


Mack 


1994l| : 


Verral] 


1994| . for instance. After the quintessential works of these authors, 


stochastic claims reserving methods are becoming more and more common. These methods not only 
estimate the expected value of the outstanding claims, but also examine its stoch astic behavior. An 
extensive description of stochastic reserving can be found in the summarizing article 


2002] and book 


Wiithrich and Merz 


200 


England and Verral! 


Most of the stochastic reserving methods estimate the distribution of the amount of outstanding 
claims, so they can be considered as probabilistic forecasts. There are numerous methods for the com pari- 
son of probabilistic forecasts, and we chose to concentrate on the probabilistic scores, see 


Gneiting and Raftery 


2007] . The cardinal aim of our present work is to demonstrate that using these comparisons is an ad¬ 


equate way of comparing stochastic reserving methods. Our hope is that in some cases this attitude 
will be more informative than the classical mean square error of prediction measure or quantile values. 
We also will pay attention to the sharpness of predictions. The goal of our work is the comparison of 
appropriateness of several stochastic estimation methods, supposing different distributional development 
models. We used several run-off triangles from the actuarial literature. It is important to note that the 
results we present are not fully comprehensive, as it would far exceed the constraints of this article. For 
example, we only used paid run-off triangles, thus we could not make comparisons with the MCMC model 
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described in 

methods in 

Merz and Wiithrich 

2010]. Due to lack of capacity, we did not calculate with the bootstrap 

Biorkwall et al. 

2009 

and 

Pinheiro et al. 

20031, and we left out the MCMC method from 

Markus and Gvarmati-Szabo 

200 

7|- 


Of the scores, we used only the CRPS and the Energy score, since in most cases a Monte Carlo 
type evaluation is needed, as the explicit description of the distribution of the ultimate claim value is 
not possible or too difficult. For instance, this is the case if this value is the sum of random variables 
derived from log-normal distribution. We applied our comparison method for an itemized (claims and 
payouts) dataset from an insurance company, where both the upper and the lower triangles were known. 
We created 2000 scenarios with random draw and replacement from the claims, and chose the most 
appropriate stochastic reserving methods based on these scenarios. 


2 Distributional Models 


In the following section we briefly expound the distributional models used in our research. 


Wiithrich and Merz 


2008 . p. 168.]. Briefly, in the Log-normal 


2.1 Log-normal Model 

The following calculation is based on 
model, triangular elements Cij (*,j € Z+, i + j < n + 1) indicate cumulative claims. The so called 
Fij = c 1 ’’ :> ^ development factors are log-normally distributed random variables, with pj log-scale and 
<jj shape parameters, and let C^o be 1. Here we only remark that the parameter esti mation is given by 
the solution of Equation 12.11 [7~?1 except cf n := 0. For further details see 


Wiithrich and Merz 


200 ; 


n-j+1 


N = 


n - j + 


H 7 ! £ ‘ 0g 


i =1 


Ci, 


= —E +1 (V a ’ 


n — i 

J i =1 


Ci,j -1 


/u 


j e {!,.. .,n} 


j e {!,..., n- 1 } 


( 2 . 1 ) 


( 2 . 2 ) 


. C*,i—i. 

Obviously, the distributions of C) jn values - in other words the ultimate payments for accident year 

n n 

i - are also log-normal with log-scale parameter J) pk, and shape parameter crj(. Unfortunately, the 


fc=l 


fc=l 


distribution of UC = Ci tU cannot be expressed explicitely, since this is the sum of n log-normal 

i= 1 

variables. 


Wiithrich and Merz 


2008 . 


2.2 Negative Binomial Model 

The construction of negative binomial development model below is based on 
p. 183.|. Rows are assumed to be independent, and in each row, the distribution of Xjj increment value 
is Poisson— l)). Here, Qij-i ~ r(ci J _i, 1) random variable under the assumption that 

j 

{cij = Cij}, thus, the model is recursive. On the other hand, for j € {1,... ,?r}, ftj = ]T) 7 *, are the 

fc =1 

partial sums of 7 = ( 71 ,..., 7 n ) payout pattern. Let fj := -g-r^ (j — 1, ■ ■ •, n— 1), and suppose that f 3 > 1, 
i.e., 7 fc > 0 for all k , which means an increasing payout pattern. It can be shown that the distribution 
of increment A'y - conditionally on Cij -1 - is NegBinom^Cij-i, (j = 2,..., n). Remark that the 

NegBinom(r, p) distribution is defined as P(£ = k) = ( ,+ ^ _1 )p r (l — p) k (k = 0,1,...). 

Parameter estimation is based on maximum likelihood estimator as follows. Let 7 denote the unknown 
parameters, X the upper triangle, moreover, suppose that the first column is fixed. The likelihood function 
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IS 


n— 1 


n —1 


E(7, X ) — | P{rOWi) — 2 — ^i,2i • • • 5 ^-i,n— 7+1 — ^7,77—7+1) — 


7=1 


7=1 


n—1 


P(^-^-l,n — 7+1 ^7,77 — 7+1 |-^7,77 — 7 — ^7,77 — 7? ■ • • J -^7,2 — ^7,2) -^7,1 — ^7,l) • • • 

7=1 

• • • ‘ P( X i ,2 = ^ 7,2 |^ 7,1 = &i,l) = 

ff +",! fc ‘ ,n "‘ +1 ' - p»-.)*"- +i ■... 

' ^7,77 — 7+1 J 


7=1 


77—1 77— J 

nn 

1=1 *=1 


C,; i + ki 2 ^ l\ Ci,i ,, xfc 4 2 

t . 2 Jj>, (i — pi) = 


E fc i,i+i 

=1 —^ max. 

p 


Tt 5 CiJ 

const - _[] 1 (1 -pj) 4 = 

i=i 

Let £( 7 , X) be logL+X), i.e., the loglikelihood: 

77—1 77—J 77—1 n—j 

1 ( 7 , X) = const + EE Ci,j logPj + EE h,j +1 log(l -pj). 

j= 1 7 = 1 J = 1 7=1 

E E ^i 5 i+i 

Thus we get the following system of equations: -^-l = - +_ p — = 0, j € {1,.. .,n — 1}. 


Pi 


i- Pj 


n-j 


Suppose that Vj J] fcjj+i > 0. This gives the estimator 

i=l 


Pi = 


77—J 

7=1_ 

n—j 

E Ci,j+ 1 
2=1 


which relate to the chain-ladder development factors. Note that 71 = P 1 P 2 • ■ ..Pri-i> 7i = (1— Pi~i)pi .. .p n -i 
(i = 2 ,... ,n - 1 ), 7 „ = 1 - Vrt .—1 ■ 


At last, note that in 


E ngland and Verral l 


2002] another approach of the Negative Binomial Model 


can be found. According to that paper, a dispersion parameter is included, and increments are so called 
over-dispersed negative binomially distributed random variables with mean (fj — 1 ) ■ C+_i and variance 
- 1) • Cij-i, respectively. 


2.3 Poisson Model 


The Poisson model is available in 


Wiithrich and Mera . 120081 p. 25.], for instance. Briefly, suppose that 


the increments are independent Xij ~ Poisson{pi'yj ) variables, i,j = 1 ,...,n. Now suppose that the 
upper triangle - as a condition, denoted by D - is given. On the one hand, the estimation of 71 ,..., 7 „ 
payout pattern values works exactly the same way as in the Negative Binomial case. On the other hand, 


Pi = 


77 — 7+1 

E x ik 

fc=i 

77 — 7+1 

E 7 k 

k =1 


Since one of our aims is to generate ultimate claim values, assuming that the real parameters are 7 \ and 

77 

/E, one random ultimate claim variable is the sum of upper triangle values (or E Ci, 77-7+1) and E lower 

7=1 

( 77 \ 77 

J] Pi(%-i + 2 + ■ ■ ■ + 7 „) , i.e., UC = E Q,n-i+ 1 + Y. 

2=2 J 2=1 
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2.4 Over-dispersed Poisson Model 

A slightly more general model rather applied in practice is the Poisson model with a dispersion parameter. 
We have the following distributional assumptions. If Yjj ~ Poisson (^j^J random variable, let Xij := 
4>Yij. Thus E(Xij) = and Var(Xij) = iniCjj , and Xjj is called over-dispersed Poisson random 
variable. 

For p and 7 we used the regular chain-ladder method, which provides unbiased estimation. On the 
other hand, e valuation of cj> requires the determin ation of Pearson residuals, f or instance. For a detailed de¬ 
scription see Wuthrich and Merz, 20081 p. 218. 

Poisson case, Pearson residuals are defined as 


England and Verrall 


20061] . Briefly, in the over-dispersed 


f p = 

ij 


%ij 


Thus 


(j) P = 


E (<%)' 

i-\-j<n-\-1 

N — p 


where N denotes the number of observations, i.e., N = nindXL, anc j ^ num } )er Q f predicted unknown 
parameters, i.e., p = 2n — 1. Remark that this method provides a biased estimator for cf) (and also for p 
parameters). 

n 

The distribution of UC, similarly to the Poisson model is UC — Ci jn _i +\ + Y, where Y ~ (j) • 


i =1 


Poisson ( i Y + ■ ■ ■ + In) 


Remark that another parameterization of the model is as follows. Let 07 ,..., a n ; /3 \,..., /?„; c; <j> be 


parameters. Incremental claims are defined as 

v , - ■ - 

El • 


ij • - (j)-Poisson ^ e - + ^ 3 + ^. Thus E(Xij) = e ai+ ^ +c and Var(Xij) = (f>e ai+ ^ j+c . Our aim is to estimate 


a and /? parameters using maximum likelihood method under the usual constraint that 07 = /3i = 0 . 


2.5 Gamma Model 


England and Verrall'. 


20021 3.3] is the following. Increments are 


On the one hand, the model described in 
assumed to be Gamma distributed random variables, i.e., Xjj ~ T(a,/3), with expected value E(Xij) = 
niij and variance Var(Xij) = Thus parameters are a = -g and j3 = 


On the other hand, in 


Wuthrich and Merz 


20081 5.2.4] X^ increments are deterministic sums of 

r ij (k) (k) 

(rij) independent Gamma random variables. In other words, X.^ = Y where X\- ~ 


k =1 


Since the rate parameters are equal, the distribution of Xij is T(vrij, —Here, E{Xij) = r'ijTrijj and 
V or {Xij) = 

If rij = 1 Vi, j. the above mentioned two models are identical, and v = 4. The only difference 
is that r ^’s can be arbitrary integers, thus the second model is more flexible. Note that for handling 
estimation methods, it is not sat isfying to know the upper triangle, but the triangle of r. t j numbers too. 
Model Assumptions 5.19 in Wuthrich and Merz 2008] state that there exist /xi,... ,p n and 71 ,... , 7 n 


parameters under the constraint that Y 7* = 1> such that E{Xij ) = r t? m i? = pcij ■ The estimates /h and 

2=1 

7 j are averages of the observations weighted by numbers rij. For more details see Model Assumptions 
5.19. in the book. 

On the one hand, /x and 7 parameters are the solution of the following system of equations: 


hi = 


n+1 —i 

E 

j =1 


Xjj 
7 j 


n + 1 — * 


7 i = 


U + 1 -3 Y 

, Ai 
2=1 


n 


1 - J 
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(Note that for technical reasons, in our simulations we used simple chain-ladder method instead of 
solving the above mentioned system of equations.) On the other hand, the estimation of parameter v 
using Pearson residuals is the following. For i + j < n + 1 let 


thus 


f p 

ij 


%ij hi3 

hh 


E (tfs ) 2 

— (2n — 1 )' 


3 Stochastic Claims Reserving 

3.1 Parametric models 

We are using the parametric models introduced previously. We estimate the parameters from the upper 
triangle, and approximate the conditional distribution of the lower triangle by determining the conditional 
distribution with the estimated parameters. Naturally, this approach might have drawbacks, for example 
the prediction intervals will not be precise - to see the problem, we could just think of how to make a 
prediction interval for the (n + l)th member of an i.i.d. sample of normal distribution from the first n 
members. As the conditional distribution of the lower triangle is difficult, and in some cases, impossible 
to calculate analitically, we estimate the distribution with Monte Carlo method, generating 5000 lower 
triangles. 


3.2 Bootstrap methods with over-dispersed Poisson and gamma distributions 

1 2002 ). 


The detailed description of these models can be found in 


England and Verrall 


Appendix 3. These 


models are also included in the ChainLadder package for R. The heart of these methods is bootstrapping 
the adjusted Pearson residuals r •1 ^ \/ 2 . n {n+i)- 2 n+i ' r hi ^ ie u PP er triangle. From the resampled 
residuals, we create a new upper triangle, fit the standard chain-ladder to it, obtain the new expected 
values fhij and variances c/yfriij for the elements of the lower triangle, simulate the lower triangle, and 
store the ultimate claim amount. We do the bootstrapping 5000 times, and use the 5000 stored ultimate 
claim amounts as the predictive distribution. 


3.3 Semi-stochastic methods 

We use the models presented in Falukozv et al. 2007| . Consider ctj ( i ) = < ~^ J+1 , and let ay, j = 1, ...,n 
be independent discrete uniform random variables with P (ctj = ctj (i)) = . The model assumption is 

that for each element of the lower triangle, the C\,j + i = Cjj ■ ctj equation holds. It is easy to see that 

n—j „ 

q _ j S~l . 

E (ctj) = ^ 3 ^ g J+1 ■ The first method (denoted as Uniform) is to simulate a sufficiently large number 
2—1 

(in our case, 5000) of lower triangles, take the arousing 5000 ultimate claims, and use this empirical 
distribution as the predicted distribution. The second method (denoted as Unifnorm) is based on the 
assumption that the ultimate claim amount is nearly a normal random variable, and the task is to 
calculate the mean and variance. The mean is the expected value of the ultimate claim amount, which is 

X] ( Cn—j+lj n E (afe) J. 


1=1 


k—j 


ill I 11 — J. II — J. 

The variance of the ultimate claim is X GVi-j+ij II E ( a V) ~ II E 2 otk 


j =l 


. k=j 


k=j 
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actuary 

predictive distribution 

ideal 

LN(p, 1) 

long-term 

LN(0, 2) 

ordinary 

\LN{p, 1) + | LN{p + <5,1), where S = ±1 with probability |, \ 

intern 

f 4p + 1 p > 0 

LN(—\p\,a 2 ), where — \p\ + a 1 = p + 1, i.e. a 1 = < 

[l p< 0 


Table 4.1: Distributions regarding certain actuaries in Example 1 (Log-normal). £ ~ LN(p,l), where 


7V(0,1). 

actuary 

predictive distribution 

ideal 

Poisson(:r • A) 

long-term 

NegBm(1.5, 1 + i. 0 . 5 ) 

ordinary 

iPoisson(a; • A) + iPoisson(x • A • S), where S = 1 ± ^ with probability \ 

intern 

NegBin(2.T • A, |) 


Table 4.2: Distributions regarding certain actuaries in Example 2 (Poisson), i) ~Poisson(a: • A), where 
A ~ r(1.5,0.5) and x = 1000. 

4 Probabilistic Forecasts 


4.1 Introductory Example 

As we mentioned in the previous section, stochastic reserving is essentially the prediction of the distribu¬ 


tion of the future payments. The systematic analysis of these distributions begun with the_work of 


1984J. Most applications of probabilistic forecast theory is used in meteorology. 


Gneiting et al 


Dawid 

2007] 


give an overwiev of the most important methods for comparing the forecasts. To present these methods, 
we give 2 simple examples, the first of which is similar to the one in the aforementioned article. 

In Example 1, suppose that an automobile insurance company already has paid 1 million euros for the 
accidents occured in year 2012. Let £ be the total amount paid for accidents of 2012 till the end of 2013. 
Assume that £ is of log-normal distribution with log-scale and shape parameters p and 1 , respectively, 
where p is supposed to be a standard normal random variable. 

In the integer valued Example 2, the number of liability insurance claims for damages occured and 
reported in 2012 was 1000. Let rj be the number of claims for damages regarding 2012, that are reported 
in 2013. Assume that the distribution of p is Poisson with parameter 1000 • A, where lambda is a gamma 
distributed random variable with shape parameter 1.5 and scale parameter 0.5. 

We compare the performance of 4 imaginary actuaries predicting the distributions in the above ex¬ 
amples. The ideal actuary knows all the relevant circumstances, which means the knowledge of the exact 
value of p in Example 1, and A in Example 2 , respectively. The long-term actuary does not intend to 
get involved in the actual information, and simply estimates the unconditional distribution. The ordi¬ 
nary actuary attempts to estimate the parameters, but the estimation has some possible error. At last, 
the intern actuary finds the expected value, but does not care about the distribution itself. We give an 
overview of the distributions and the predicted distributions in Table 14.11 and Table 14.21 Year 2013 was 
simulated 10,000 times and the different probabilistic forecasts were compared according to them. 
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PIT histograms for Example 1 (sample size=10000) 
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Figure 4.1: PIT histograms for Example 1 (Log-normal example) 


4.2 PIT, Empirical Coverage, Average Width 

In the evaluation of the predictions, we have (Fj.Xj) pairs, where Fi denotes the predicted distribution 


Dawid 


in the ith case, and Xi the realization, respectively. 

Transform (PIT), and the description available in Gneiting et al 


19841 suggest ed using the Probability Integral 
is also worth seeing. The main 


20071 


idea is that substituting a random variable with continuous probability function into its own distribution 
function, will yield a uniformly distributed variable on the (0,1) interval. The distribution of PIT values 
has to be uniform on interval (0,1), otherwise the estimation is biased. Remark that the uniform rank 
histogram prope rty is a necessary but not sufficient criterion for ideal forecasts, as illustrated by an ex¬ 
ample in Hamill 2001]. For a fixed upper triangle - as a condition - we calculate the predicted distribution 


function of the ultimate claim, and substitute the real ultimate claim into it. 

We check the prediction by plotting the empirical CDF of the corresponding PIT values and comparing 
it to the identity function, see P-P plots below, or by plotting the histogram of the PIT values and 
checking for uniformity. On the one hand, U-shaped histograms indicate underdispersed, excessively 
light-tailed predictive distributions. On the other hand, D-shaped histograms hint at overdispersion, too 
heavy predictive distribution in tails, and skewed histograms occur when central tendencies are biased. 

In high count data cases this method can be used very well. In low count data cases, rand omize d PIT, 
or non-randomized uniform version of the PIT histogram is the more appropriate choice, see 


Czado et al. 


200 


fi||. PIT histograms regarding the two example can be seen on Figure [TTI and Figure ITT1 where 


dashed lines represent uniform distributions. These figures provide additional examples for inaccurate 
probabilistic forecasts with appropriate PIT histograms, since shapes regarding long-term and ordinary 
predictions deviate not more considerable from uniform distribution, as in the ideal case. Nevertheless, 
the intern case shows inappropriateness immediately. 

We also intend to define P-P plots for random variables £ and predictive distributions F. Namely, let 
s p := sup {2 : F(z) < p}, thus the P-P plot function ([0,1] —»• [0,1]) is as follows: p i-d P(£ < s p ). Remark 

Z 

that integrating the PIT function results the P-P plot, i.e., there is a bijection between the two concepts. 
It also follows obviously that a slanted-S shaped P-P plot corresponds to a D-shaped PIT, for instance. 
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PIT histograms for Example 2 (sample size=10000) 
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Figure 4.2: PIT histograms for Example 2 (Poisson example) 


An example for run-off triangles can be seen on Figure [5^1 Using the notations of our simulations, for a 


N 


real number p G (0,1), the function value is jj X{£<Qu(ri where Qu(r).j,p) is the p-quantile value 

i= 1 

of empirical distribution defined by pij,, r]Mj ■ 

At last, two linked measures will be considered for each probabilistic forecast. The idea derives from 


Baran et al 


2012], and an example calculation can be seen on Table HT21 Coverage in % related to a central 


prediction interval a ■ 100% is the proportion of observations between the lower and upper quan- 

N 

tiles. In accordance with our terminology, it can be written as jj J2 X{Qu(ri. j ,(i-a)/2)<z j <Qu(r l . j ,(i+a)/2)}- 


3=1 

The average width for a central prediction interval is the difference of lower and upper quantiles 

expressed in payments, in expectation. It can be interpreted as the sharpness of the prediction. If the 

performances of two stochastic claims reserving techniques are relatively close to each other, measured 

in scores, the one with the narrower width is the better. Formally we calculated the following way: 
n / \ 


W 2 ( rfU+aYM . — m . ) • Even though these could be derived from the PIT, in some applications 

j= 1 \ 2 >3 2 by 

it is important to know the exact values, thus a separate calculation is advised. See Table PTTTTI and Table 
14.41 regarding Example 1 and Example 2, respectively. It can be seen again, looking at the mentioned 
tables that inappropriate distributions may provide good results. Namely, the coverage of the long-term 
and ordinary actuary, and the average width of the intern actuary is acceptable as well. However, in our 
examples only the ideal gives suitable values for both measures. 


4.3 Continuous Ranked Probability Score and Energy Score 

Probability scores are an increasingly widespread technique to measure the quality of the predicted 
distributions. A score is a function of the predictive distribution and a realization of the real value. The 
predictive distribution can be represented by its CDF, empirical CDF or PDF, depending on which score 
is used. Forecast models are then ranked by comparing the average score of the predictive distributions 
from each model. 

The most highlighted measure of goodness of fit in this article is the so called Continuous Ranked 
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Actuary 

Coverage (%) 

Average width 

66% 

90% 

66% 

90% 

ideal 

67.0 

90.5 

1.93 

3.29 

long-term 

67.3 

90.5 

2.74 

4.65 

ordinary 

67.2 

90.4 

5.01 

11.64 

intern 

47.3 

74.2 

2.87 

4.88 


Table 4.3: Coverage and Average width in Example 1 (Log-normal) 


Actuary 

Coverage (%) 

Average width 

66% 

90% 

66% 

90% 

ideal 

66.2 

89.5 

48.91 

83.16 

long-term 

65.5 

89.5 

1052.00 

1868.00 

ordinary 

66.4 

89.5 

98.24 

140.62 

intern 

76.2 

95.2 

59.89 

101.84 


Table 4.4: Coverage and Average width in Example 2 (Poisson) 


Probability Score or CRPS, which is devoted to handle the case of prediction of distribution functions. The 
main reason of giving preference to the CRPS over other several score concepts is the general applicability, 
i.e., it can be applied to predictions (functio ns) regardl ess of absolute continuity or discrete behaviour. 
A detailed description can be found in 


Gneiting and Rafterv. 


2007 . p. 366.], for instance. Briefly, let F 


denote the predictive distribution function, and c one single observation. Here 


OO 

CRPS(F,c) = - J (F(y)-l {y > c} ) 2 dy. 

— OO 


(See Table 15.11 and Fi gure 15.31 1 A generalization of CRPS is the so called Energy Score , see the 


aforementioned 


Gneiting and Rafterv 


2007], for instance. The definition of one dimensional energy score 


ES(F,c) = -E F \X-X'f-E F \X-c\P, 


where X 7 X' are i.i.d. variables from distribution F, and /? G (0,2). It can be shown that for /3 = 1 we 
get the CRPS back. We note that CRPS can be evaluated directly, or in case of /? ^ 1, it is viable to 
approximate the energy score by sampling from the empirical distribution, which is typically a much less 
rapid way, because of the necessary sample size to be drawn to reach a satisfying accuracy. (During our 
simulations, the magnitude of difference in time was approximately 10 2 , although even so not significant.) 

Table 14.51 presents the score results in expectation, regarding the two examples (Log-normal and 
Poisson). For instance, consider the Log-normal example in case of the ideal actuary. Given {/r = m} 
and {£ = a;}, the score value is in the form J 0 °° (<t>(logy — m) — 1{ 2/ > CC })^ dy■ It turns out that the score 
measure provided proper ranking, although, the difference between ideal and intern values regarding the 
second example is not of distinction. 
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ideal 

long-term 

ordinary 

intern 

Example 1 

-0.48 

-1.86 

-0.63 

-1.83 

Example 2 

-14.49 

-327.62 

-32.62 

-14.64 


Table 4.5: CRPS results in Example 1 and Example 2 


4.4 Mean square error of prediction (MSEP) 

The mean square error of prediction (MSEP) of predictor (point estimation) X for ultimate claim X, 
conditionally on the u-algebra T is defined as 


msep x \AX ) =eUx- l) 2 | J-j = Var(X\T) + {X - E(X\E)) 2 . 


See Definition 3.1 in 


Wiithrich and Merz 


200 ; 


msepx(X) = E — xj ^ 


, for instance. The unconditional MSEP is 
= E(Var(X |JF)) + E((X - E{X\E)) 2 ). 


In the detailed examples the above defined errors can be calculated right away. 

For instance, actuaries according to Example 2 have mean square error of predictions as follows. 


msep v (ideal) = msep^ (intern) = E(msep v \\(x\)) = 

= E(Var(r]\\)) + E((x A — E(rj |A)) 2 ) = E{xX) = x ■ a ■ /3 


mseprj (long-term) = msep ri (E(ri)) = Varirf) 



= x ■ a ■ /3 • (1 + x ■ f3) 


mseprf (ordinary) = E(msep v \\(- ■ xX ■ (1 + 5))) = 
= E{Var{ V |A)) + E{{\ • xA • (1 + 6) - E(p |A)) 2 ) = 




1 + 


(1 + a) • /3 
400 


where a = 1.5, /3 = 0.5 denote the shape and scale parameters of the gamma distributions. Results in 
conjunction with the two simple example are shown on Table 14.61 The tables show that in the long-term 
case, the prediction is inaccurate even though the PIT histogram and the prediction interval coverages 
suggest that the fit is good. 



ideal 

long-term 

ordinary 

intern 

Example 1 

34.5 

47.2 

37.2 

34.5 

Example 2 

750.0 

375750.0 

3093.8 

750.0 


Table 4.6: Mean Square Error of Prediction in Example 1 and Example 2 


When we work with run off triangles, the definiton is the following. 

msepucwi ( UCj ) = Var(UC\V 3 ) + {UC 0 - E(t/C'|D J )) 2 j = l,...,N, 

for each real triangle. Let Xj be the real ultimate claims (j = 1,... , 7 V) and yij the generated ultimate 
claims (i = 1,..., M, j = 1,, TV). On the other hand, let Zij be an ultimate claim generated with the 
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real parameters and according to the real development distribution, conditionally on upper triangle T>j. 
Thus, in our calculations 


msepuciDjiUCj ) 


M 


E (% 

i=1 


E{UC \V 3 )f 


M - 1 


/ M 

E 

2=1 

M 

v 



Remark that if we calculate the msep values supposing the knowledge of real parameters instead of 
estimation, we get the pure Var(UC\T>j) back, since yi 3 = Zij. 

On the other hand, Table O shows MSEP values regarding the run-off triangle example detailed in 
Section [5] 


5 Simulation and Results 


5.1 Preliminary Remarks 

Our calculations have been implemented in R, and beyond a self made program code, we used the 
ChainLadder package. The documentation of the latter can be found on webpage 

http://cran.r-project.org/web/packages/ChainLadder/ChainLadder.pdf. The former is available 
on url http: //onrequest with a short user manual enclosed. Remark that the detailed simulation results 
for several parameter sets are also to find on this page. Moreover, the completion of tables and figures 
have been made using packages xtable and ggplot2. 

In this section, a Monte Carlo type method will be introduced, followed by simulated examples, con¬ 
sisting of corresponding goodness of fit values described in Section 2J Parameters of the example come 
from the run-off triangle RAA, which is an accumulated claims triangle from the Automatic Facultative 
business in General Liability, originally published in Historical Lo ss Develo pment, Re insurance Associa¬ 
tion of America (RAA), 1991, and was also used as an example in 


England and Verrall 


2002 ] and 


Mack 


19941 ]. It is crucial to emphasize that this well known triangle plays no role in the present article apart 


from providing parameters. In other words, during our research work we fitted several distributional mod¬ 
els only to get parameter values from real data, but without studying goodness of fits, which is another 
objective. Due to lack of space, only a special example is presented, but additional simulation results are 
available on the aforementioned webpage. Just to mention some claims data from l iterat ure , these a r e re¬ 
l ated to ABC (a workers’ compensation portfolio of a large company, first used in 


Barnett and Zehnwirth 


from 


20001 , which has an in-dept h analysis of the data array as well ), Genlns (a general claims data triangle 


Zehnwirth 


199j). 


Tayl or and Ashel 19831 ] . and was also used in 


Mack 


1993]) and M3IR5 (a simulated triangle, from 


5.2 Monte Carlo Type Method 

This subsection is devoted to the detailed description of our Monte Carlo type method (MC method) for 
comparison of various claims reserving methods in case of different distributional background of run-off 
triangles. In other words, our goal is to establish a ranking among the different stochastic claims reserving 
techniques, if the real development property of claims payments for accident years is in accordance with 
certain models, as described in Section [Distributional Modelsl for instance. This comparison now is not 
based on mean square error of prediction or quantile values, but on scoring rules. Nevertheless, the 
results and plots containing the former values are also included in the article, which allows us to compare 
them with the unconventional score results in theory of insurance. Remark that the important role in 

n 

our approach is played by ultimate claim values, i.e., the aggregate Ci, n payments for accident years 

i=l 
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1 ,,n. Although, if someone is more interested in focusing on solely the next year payments, i.e., in 

n 

Y X n -j + 2 ,j, the Monte Carlo type method can be interpreted easily without any special alteration. This 
3 =2 

latter reserving value is suggested by Solvency II. 

Suppose that the development distribution model of run-off triangle and corresponding parameters 
are given. As a first step, we generate N run-off triangles independently, and besides, for each the ultimate 
claim values UC\, UC 2 , ■ ■ ■, UCn- Let 'Di,T> 2 ,.. ■ ,T>n denote these upper triangles, as conditions to be 
used later. 

In the second step, for every generated upper triangle, we calculate the predicted distribution of the 
ultimate claims, using the methods described in IDistributional Modclsl As we mentioned there, we de¬ 
termine these predicted distributions via Monte Carlo method. Specifically, in the parametric models 
case, the parameters for each generated upper triangle have to be evaluated. They will certainly differ 
from each other, and from the real parameters as well. Assuming these parameter values for each con¬ 
dition T>i, or upper triangle, in other words, M ultimate claim values have to be generated, denoted by 
UCi t i, UCi t 2 , ■■■, U Ci t M 1 as stochastic predictors. Thus we get the predictive distributions. Following 
that, we prepare the forecasts using the 2 bootstrap, the uniform and the uniform normal methods. 

At last, in the third step, for each pair UCi and tuple (UCtp, UC. h 2 , • • ■, UCi t M) scores, PITs, msep 
and quantile values, and additionally the confidence intervals (coverage and average width) have to be 
calculated according to Section IProbabilistic Forecasts! Generally, a predictive distribution function F t 
derives from values UCi t \,UCi t 2 , ■ ■ ■, , and the corresponding c value is UCi. 

As an example, the results for regarding a Gamma Model example are shown on several figures below. 
N is chosen to be 2000, and M to be 5000. Figure 15.11 contains the histograms of 2000 PIT values. On 
the one hand, Figure 15.81 consists of boxplot representations of CRPS scores for each claims reserving 
technique. On the other hand, the mean CRPS values can be found on Table l5Tl Similarly, Figure [5~T1 
represents the energy scores for /3 = L. Remember that the higher score values mean more appropriate 
reserving methods. Remark that on boxplots, the ’reserving methods’ axis has to be interpreted as follows: 

1 - Log-normal 2 - Negative Binomial 3 - Poisson 

4 - Over-dispersed P. 5 - Gamma 6 - Uniform 

7 - Unif. Normal 8 - Bootstrap Gamma 9 - Bootstrap Od. Pois. 

10 - Ideal 

We have not mentioned method ’Ideal’ so far, which differs from all other methods essentially. Namely, for 
each distributional model, we assume that parameters are known, and use them in the above described 
second step, instead of any estimation. The reason of the inclusion of these results is that theoretically 
this means the best way of the prediction of claims reserves, in expectation. In other words, we used it 
for reference purposes, recall the example of ideal actuary. 

5.3 Results on a Gamma distributed example 

We now present the results for a Gamma distributional model. These can be interpreted in a number of 
different ways. 

1. Which score or error number is the most consistent, and how do they correlate with each other? 
Do the equipments applied in stochastic predictions choose the actual models better than regular 
measures, such as mean square error of prediction, for instance? 

2. Which non-parametric, distributional free methods predict the distributions properly? Do they 
outperform prediction methods derived from parametric models? 

3. How reliable and sharp are the prediction intervals? 
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Parameters are 


(/xi,..., n 10 ) = (21048,17507,23723,29562,25751,18680,15676,22141,19019,18402), 

( 7 !,..., 710 ) = (0.112,0.224,0.209,0.147,0.119,0.092,0.037,0.031,0.016,0.009), 

and v = 2.22. Based on the MSEP values on Tabic l5Jl the first thing that stand out is the poor fit of 
the Log-normal model. The other 4 parameter estimating models are roughly the same. The bootstrap 
methods are slightly worse, and the uniform and unifnorm methods are giving very poor results compared 
to the others. 


Res. Method 

CRPS (mean) 

En. Sc. (mean) 

MSEP (mean) 

MSEP (median) 

Log-normal 

-17840 

-77.47 

6.257e+09 

2.159e+09 

Negative Binomial 

-9878 

-81.03 

1.799e+08 

9.856e+07 

Poisson 

-10010 

-84.23 

1.799e+08 

9.862e+07 

Over-dispersed P. 

-7547 

-57.10 

1.800e+08 

9.911e+07 

Gamma 

-6911 

-54.28 

1.539e+08 

9.109e+07 

Uniform 

-20980 

-80.25 

6.144e+09 

4.716e+09 

Unif. Normal 

-50040 

-145.90 

6.139e+09 

4.711e+09 

Bootstrap Gamma 

-7856 

-57.82 

2.021e+08 

9.880e+07 

Bootstrap Od. Pois. 

-7865 

-57.84 

2.015e+08 

1.005e+08 

Ideal 

-4074 

-41.53 

5.296e+07 

5.295e+07 


Table 5.1: Scores and Mean Square Errors in the Gamma Model example 


If we take a look at the PIT histograms on Figure 15.11 we can see that the Negative Binomial 
and Poisson distributional model produces very poor fits, as we might have expected. In other words, 
these provide great examples for light-tailed predictions. The reason is that the model parameters imply 
e.g. a Poisson variable with a very high expected value, which makes the standard deviation very low 
compared to it. Hence the ultimate claim values will be in a very narrow range compared to the predicted 
distributions, leading to PIT histograms, where all the occurrences are in a 10-20 percent wide probability 
range of the predicted distributions. The other side of this attribute of the Poisson distribution is that 
when we use a Poisson distribution as the predicted distribution, the difference between the smallest and 
the largest value of the empirical predicted distribution will be very small, consequently the real ultimate 
claim values will most of the time be either bigger or smaller than all 5000 values of the predicted 
distribution. Thus, the corresponding PIT graphs will contain occurrences almost exclusively in the 5 
percent and 95 percent probability levels. In essence, the PIT analysis strongly suggests against using the 
Poisson distribution as the incremental claims of the IBNR claims in the current example, as even though 
the MSEP values are acceptable, the low variance attribute leads to the Poisson distribution seemengly 
being little upgrade over a simple point estimation of the expected value. Although, it is important to 
remark that considering real data, occurrence of nearly Poisson distributed triangles for pay amounts is 
not unprecedented. Looking at the fl-shaped histograms deriving from the bootstrap methods, there is 
an example for heavy-tailed predictors to be seen. 

When examining the P-P plots, we expected to see the best results, i.e., a graph closest to the identity 
function, when the real distribution is of the same type as the one in the MC method, and the calculations 
proved just that, see Figure 15721 Although, even if we get the Gamma model right, but do not know the 
real parameters, the prediction is slightly overdispersed. We can conclude that in terms of the consistency 
of quality of the P-P plots, the bootstrap methods provide quite acceptable results. The Uniform and 
Uniform normal semi-stochastic methods, even though we expected them to perform almost as well as 
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Figure 5.1: Histograms of PIT values in the Gamma distributed, example 


the bootstrap methods in the P-P plot test, are giving worse results. Regarding the Poisson distribution, 
we can come to the same conclusions as we did in the PIT analysis: it provides too little variance to 
effectively predict any other distribution, and predicting it with any other method, the variance of the 
resulting distribution would be too big to be considered a good fit. 

Regarding the stochastic methods in the example, the Gamma model is clearly the best in the score 
metric, followed by Negative Binomial, and bootstrap methods. See Figure HT51 and l5~H which contain the 
boxplot representations of CRPS and Energy Score values. Table loTTI summarizes the mean score values. 

On the other hand, the Uniform and Uniform normal methods give even worse results than if we were 
using a wrong distribution in any of the 5 MC methods, so even though those 2 do not require us to 
guess the underlying distribution, their results are worse then if we were making the wrong guess, and 
then build a parameter approximating MC model based on that guess. 

Table 15.21 presents the inaccuracy of the applied prediction intervals, moreover, the other run-off 
triangles and models gave similar results in this sense. In cases where the coverage is said to be acceptable, 
the sharpness is mostly inappropriate, i.e., intervals are wide. The coverage values for Log-normal model 
are closer to the 66% and 90% values, then in the Gamma case, but average width results are higher. 

5.4 Public payments data example 

In case of the detailed dataset provided by an insurance company, we proceeded not exactly as in the 
analysis of the known run-off triangles. Which means, each of the 2000 real run-off triangles were generated 
the following way. We used sampling with replacement 43, 081 times from the set of 43, 081 accidents, 
which determines a triangle. Moreover, it resulted in a 6 x 6 quadrangle also, and in the real ultimate 
claim, respectively. Following this simulation, we did exactly as described before, i.e., determined the 
predictive distributions for the 2000 triangles with several methods, and compared them to the real 
ultimate claim values. 

Figure [ST5] shows that in the light of PIT values, none of the methods based on distributional models 
are recommended. Best histograms are resulted by the bootstrap estimation methods. Although uniform 
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Figure 5.2: P-P Plots in the Gamma Model example 


and uniform normal methods are not much worse, they underperform in the cases of extreme claim values. 
It is also confirmed by Table [5751 i.e., the proportion of prediction intervals of bootstrap methods are 
close to the 66.67% and 90% values. Unfortunately, in exchange for the appropriate coverage percentages, 
average width values are higher. 

On the one hand, according to the CRPS values in Table f5~Tl the uniform claims reserving method 
has the highest score, which has not been performing well for other triangles. On the other hand, energy 
scores are slightly better for the bootstrap methods. It has to be mentioned that the msep values are the 
highest in the bootstrap cases, in other words, this measure implies a different ranking. 


6 Conclusions 


We conclude our article by trying to answer the 3 questions proposed in the previous section. The method 
that gives the best results overall, and is the least sensitive to the underlying distribution is the bootstrap 
gamma and bootstrap over-dispersed Poisson method. Had we chosen a different one for our paper, we 
might have deduced that the 2 bootstrap methods are the same in strength. It also became clear that the 
Uniform and Uniform normal methods are significantly worse than the bootstraps in every evaluation. 
The 5 parameter approximating MC methods give good results when we guess the underlying distribution 
correctly, and as the P-P plots and PIT values show, they can be much worse then the bootstraps when 
applied to the wrong distribution. The PIT and P-P plots advised against the use of the Poisson model 
for the RAA dataset. When we studied other datasets, we found that the Poisson model exhibits the 
same limitations, however, the Log-normal model was just as good as the other distributions. In general, 
scores provide a reasonable fit to the background distribution. Unfortunately, the methods used in this 
paper - and also applied in insurance industry - result not reliable prediction intervals. 

This negative result is not surprising, since a typical run-off triangle contains much less information 
than the unknown parameters. To improve the predictions. iMevera [20Q7| proposed the usage of Bayesian 
methods. In our opinion, other than these methods, it would also be worthwhile to try using individual 
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Figure 5.3: Boxplots of CRPS values in the Gamma Model example 
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contract and claim data in probabilistic forecasts. 

In our work, we used the individual claim dataset of an insurance company as well, but not for 
probabilistic forecasting, but to propose a technique for comparing stochastic methods. This technique is 
not sufficiently mathematically established yet, but may be applied in practice. One can simulate many 
scenarios from the past, and choose a model, which best fits the claims data of the insurance institution. 

We hope our work helped to show that in stochastic claims reserving, both in theoretical and in 
applied situations, it is worthwhile to test the quality of the different methods, and in multiple ways if 
possible. 

As it was mentioned before, our goal was not the construction of the best stochastic claims reserving 
technique, but to propose an adequate methodology for comparisons in the future. We hope to take the 
first step in this direction with our work. 
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Figure 5.4: Boxplots of Energy Score values in the Gamma Model example 
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Reserving method 

Coverage (%) 

Average width 

66.67% interval 90% interval 66.67% interval 90% interval 

Log-normal 

63.4 

81.1 

89035 

245659 

Negative Binomial 

3.3 

5.4 

919 

1562 

Poisson 

1.4 

2.8 

444 

755 

Over-dispersed P. 

48.7 

71.0 

15497 

26322 

Gamma 

50.4 

73.6 

15506 

26533 

Uniform 

75.4 

97.0 

111512 

422161 

Unif. Normal 

95.9 

100.0 

287986 

489479 

Bootstrap Gamma 

86.6 

98.4 

39946 

70131 

Bootstrap Od. Pois. 

86.6 

98.5 

39951 

70112 

Ideal 

66.8 

89.4 

13967 

23821 


Table 5.2: Coverage and Average width in the Gamma Model example 
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Figure 5.5: Histograms of PIT values in the Public Data case 

Reserving method 

Coverage (%) 

Average width 


66.67% interval 

90% interval 

66.67% interval 

90% interval 

Log-normal 

19.2 

31.9 

1261338486 

2170190302 

Negative Binomial 

0.0 

0.0 

573564 

974878 

Poisson 

0.0 

0.0 

163897 

278579 

Over-dispersed P. 

15.2 

26.7 

954272655 

1621537701 

Gamma 

23.8 

40.0 

1395724439 

2375100112 

Uniform 

56.2 

72.5 

3228640816 

4738399784 

Unif. Normal 

38.8 

59.9 

1963981053 

3337926040 

Bootstrap Gamma 

61.5 

87.1 

4490491606 

8679547627 

Bootstrap Od. Pois. 

61.6 

87.1 

4492743160 

8679552105 

Table 5.3: 

Coverage and Average width in 

the Public Data 

case 

Reserving Method 

CRPS (mean) En. Sc. (mean) 

MSEP (mean) 

V1SEP (median) 

Log-normal 

-1.526e+09 

-27780 

5.420483e+18 

3.699599e+18 

Negative Binomial 

-1.817e+09 

-38850 

5.348926e+18 

3.671239e+18 

Poisson 

-1.817e+09 

-38970 

5.348924e+18 

3.671239e+18 

Over-dispersed P. 

-1.579e+09 

-28820 

5.350275e+18 

3.671087e+18 

Gamma 

-1.397e+09 

-25880 

4.803373e+18 

3.239738e+18 

Uniform 

-1.227e+09 

-23530 

4.101087e+18 

2.726758e+18 

Unif. Normal 

-1.231e+09 

-23600 

4.100103e+18 

2.723995e+18 

Bootstrap Gamma 

-1.347e+09 

-23470 

7.204460e+19 

5.300120e+18 

Bootstrap Od. Pois. 

-1.347e+09 

-23470 

4.082055e+19 

5.308000e+18 


Table 5.4: Scores and Mean Square Errors in the Public Data case 
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